Abstract: The purpose of this study was to determine the effects of an early numeracy preventative Tier 2 intervention on the mathematics performance of first-grade students with mathematics difficulties. Researchers used a pretest-posttest control group design with randomized assignment of 139 students to the Tier 2 treatment condition and 65 students to the comparison condition. Systematic instruction, visual representations of mathematical concepts, purposeful and meaningful practice opportunities, and frequent progress monitoring were used to develop understanding in early numeracy skills and concepts. Researchers used progress-monitoring measures and a standardized assessment measure to test the effects of the intervention. Findings showed that students in the treatment group outperformed students in the comparison group on the progress-monitoring measures of mathematics performance and the measures that focused on whole-number computation. There were no differences between groups on the problem-solving measures.


With the reauthorization of the Individuals With Disabilities Education Act (IDEA, 2004; Public Law 108-446), states are implementing a response-to-intervention (RTI) process as a way to identify students with learning difficulties at a young age and provide intervention services to prevent future learning disabilities. A multitiered prevention and intervention model that includes universal screening, validated interventions, and ongoing monitoring of student response to instruction is one means for operationalizing RTI to identify those students who are most in need of intensive intervention (Vaughn, Wanzek, & Fletcher, 2007). A multitiered approach to early reading intervention is widely implemented across school districts nationwide (Vaughn, Wanzek, Woodruff, & Linan-Thompson, 2007). Equally important is the development and validation of Tier 2 intervention protocols as part of RTI early mathematics instruction. Educators must have access to validated, preventative early mathematics Tier 2 interventions to implement the RTI model with students who manifest mathematics difficulties.

EARLY MATHEMATICS TIER 2 INTERVENTION

Recommendations from the National Mathematics Advisory Panel (NMAP; 2008) underscore the importance of providing early intervention that employs effective instructional practices for at-risk students. For early mathematics interventions, research results are beginning to inform an understanding of the types of instructional practices and intensity of interventions that contribute to mathematics performance. Studies of the effects of Tier 2 mathematics interventions on the mathematics performance of at-risk first-grade students have produced findings that have implications for the design and delivery of interventions. For example, in one study, Bryant, Bryant, Gersten, Scammacca, and Chavez (2008) delivered mathematics intervention in small groups 3 to 4 days per week for 15 min per session for 18 weeks (total of 1,080 min and 72 sessions). The intervention focused on number concepts and operations such as quantity, counting, numerical sequencing, basic facts, and place value concepts. Although students’ performance in small groups indicated that they understood the concepts, the study found no significant effect for first graders (n = 26 Tier 2 students) on the mathematics progress-monitoring measures. The authors hypothesized that students did not have sufficient daily time to practice the fundamental numeracy concepts to show significant findings on the fluency measures.

In a follow-up study, Bryant, Bryant, Gersten, Scammacca, Funk et al. (2008) designed a first-grade mathematics intervention that focused on early numeracy concepts and operations similar to those taught in the earlier study. The follow-up study included a longer duration that consisted of 20-min sessions 4 days a week for 23 weeks (total of 1,840 min and 92 sessions); thus, more practice opportunities across the school year were built into the revised intervention as a function of increased intervention time. Results showed a significant effect for Tier 2 intervention for first-grade students (n = 42).

In yet another first-grade study, Fuchs et al. (2005) identified 127 students, from a pool of 564 first graders, as being at risk for mathematics difficulties based on scores from a set of screening measures. The identified students received small-group tutoring 3 times per week for 16 weeks, with 30 min devoted to numeracy concepts and 10 min to addition and subtraction facts using computer-assisted instruction (CAI; total of 1,440 min for early numeracy intervention and 480 min to develop fact fluency using CAI; 48 sessions). Topics for the tutors were almost exclusively related to number concepts and operations. Results showed that at-risk students in the treatment group demonstrated performance that was statistically significantly better than that of the at-risk control group on a standardized measure of concepts and applications and story problems. On the addition and subtraction fact fluency measures and a standardized measure of applied problems, however, the treatment and control at-risk groups scored comparably.

Finally, Fuchs et al. (2006) focused on the efficacy of CAI for developing addition and subtraction fluency. Students worked on the computer on fact retrieval for 50 sessions of 10 min each (total of 500 min) over 18 weeks. The study found a significant effect for addition number combinations on fact-fluency retrieval but no effect for subtraction fluency or transfer of learning to word problem solving. These authors recommended using a stronger instructional design that focused on formatting problems vertically, including more pictorial representations, and having students practice number combinations with paper-pencil and flash cards to promote transfer from computer to pencil and paper.
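The dosage totals reported for the studies above follow from sessions multiplied by minutes per session. A minimal sketch of that arithmetic, using only the session counts and session lengths reported above (the variable and function names are illustrative, not drawn from any of the studies):

```python
# Dosage figures as reported in the studies summarized above.
# Each entry: (label, number of sessions, minutes per session).
studies = [
    ("Bryant, Bryant, Gersten, Scammacca, & Chavez (2008)", 72, 15),
    ("Bryant, Bryant, Gersten, Scammacca, Funk et al. (2008)", 92, 20),
    ("Fuchs et al. (2005), numeracy tutoring", 48, 30),
    ("Fuchs et al. (2005), CAI fact practice", 48, 10),
    ("Fuchs et al. (2006), CAI fact retrieval", 50, 10),
]


def total_minutes(sessions: int, minutes_per_session: int) -> int:
    """Total instructional time in minutes for one intervention."""
    return sessions * minutes_per_session


totals = {label: total_minutes(s, m) for label, s, m in studies}
for label, minutes in totals.items():
    print(f"{label}: {minutes} min")
```

Laying the totals side by side makes the range of intervention time across studies easy to compare.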

PERFORMANCE OF YOUNG STUDENTS WITH A MULTITIERED APPROACH

Continued research is needed to investigate mathematics interventions for struggling students, specifically interventions that consist of the critical features of instructional design, including sufficient time for students to learn early numeracy concepts and operations. The purpose of this study was to determine the effects of an early numeracy preventative Tier 2 intervention on the mathematics performance of first-grade students with mathematics difficulties. We were also interested in determining whether Tier 2 students with mathematics difficulties generalized (transferred) their learning in early numeracy concepts, which we taught, to distal measures (i.e., a progress-monitoring measure and a standardized achievement test). The following questions and hypotheses guided our research:

1. Did students receiving the early numeracy Tier 2 intervention demonstrate improved performance on timed progress-monitoring measures of early numeracy mathematics, closely aligned to intervention curricula, when compared to students receiving “business as usual” mathematics instruction with no particular intervention? We hypothesized that students in the treatment group would outperform students in the “business as usual” comparison group.

2. Did students receiving the early numeracy Tier 2 intervention demonstrate improved performance on a distal progress-monitoring measure of problem solving and mixed whole-number computation and on a distal standardized measure (problem solving and procedures [mixed whole-number computation]) of mathematics when compared to students receiving “business as usual” mathematics instruction? We hypothesized that students in the treatment group would outperform students in the “business as usual” comparison group on the mixed whole-number computation distal measures because our intervention included a strong computation component. We also hypothesized that there would be no differences between groups on the distal problem-solving measures because we did not directly teach the skills and concepts (mathematical ideas; domains) measured on the problem-solving tests.

METHOD 

Participants and Research Design 

Sampling Procedures, Risk Assessment, and Power Analyses. Two main considerations drove sample selection: (a) maintaining sufficient power and (b) reliably assessing risk. Of the initial pool of students (N = 771), the lowest 35% (n = 269) were identified as being “at risk” based on an initial administration of the Texas Early Mathematics Inventories-Progress Monitoring measures (TEMI-PM; University of Texas System & Texas Education Agency, 2007b; refer to the “Measures” section of this article for further details about this test) in the fall (September). Of the 269 students, 31 were omitted from consideration because of disabilities. Students with disabilities were omitted from the sample because the intervention did not provide the level of individualized, intensive instruction that is often required to help these students master mathematics concepts and skills.

For the remaining students (n = 238), we administered four additional TEMI-PM probes (alternate forms of the original measure used for student selection) over a 3-week period to determine whether there were false positives among the initial pool of students. False positives are a particular concern given the generally “chaotic” nature of early achievement and the increased possibility of falsely identifying students as being “at risk” when they were merely distracted, anxious, or unfamiliar with the testing protocols. Growth modeling (with continuous outcomes and auto-correlated residuals) was used to estimate case-level factor scores for intercept and slope for each of the 238 cases using Mplus 4.1 (preliminary analyses suggested a statistically significant positive trend in scores over time, on average; thus, a growth model approach was preferred over a confirmatory factor model). We conceptualized intercept as the last of the four additional TEMI-PM measures (beyond the TEMI-PM used to initially identify the lowest 35%). Estimated Time 4 scores were used to make the final sample selection. The cut score was selected based on the probabilities of diagnostic accuracy (i.e., likelihood ratio [LR]) derived using receiver operator curve (ROC) analysis. Using this procedure, we found 14 students to be false positives and eliminated them from the sample.
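To illustrate the kind of quantity involved in choosing a cut score, the sketch below computes a negative likelihood ratio, (1 - sensitivity) / specificity, at several candidate cut scores. It is only an illustration under stated assumptions: the scores, the risk statuses, and the rule that a student is flagged when the score is at or below the cut are all hypothetical, and the article does not describe the authors' computation at this level of detail.

```python
def negative_likelihood_ratio(scores, truly_at_risk, cut_score):
    """Return LR- = (1 - sensitivity) / specificity for one cut score.

    Assumption: a student is flagged as at risk when the score is at or
    below cut_score.
    """
    flagged = [s <= cut_score for s in scores]
    true_pos = sum(f and r for f, r in zip(flagged, truly_at_risk))
    true_neg = sum((not f) and (not r) for f, r in zip(flagged, truly_at_risk))
    n_at_risk = sum(truly_at_risk)
    n_not_at_risk = len(truly_at_risk) - n_at_risk
    sensitivity = true_pos / n_at_risk
    specificity = true_neg / n_not_at_risk
    return (1 - sensitivity) / specificity


# Hypothetical estimated Time 4 scores and hypothetical true risk status.
example_scores = [5, 8, 10, 12, 15, 18, 20, 22]
example_status = [True, True, True, False, True, False, False, False]

for cut in (10, 12, 15):
    lr_neg = negative_likelihood_ratio(example_scores, example_status, cut)
    print(cut, round(lr_neg, 2))
```

A smaller negative likelihood ratio means that scoring above the cut is stronger evidence that a student is not at risk.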

A concern with accuracy and the need to maintain an adequate sample size both influenced our sampling strategy. Preliminary power analyses suggested a sample size of 240, with 160 in the treatment condition and 80 in the comparison group. The initial pool of eligible students was only 238, so our strategy was to identify students who clearly were not at risk, based on their estimated score at Time 4 and a conservative risk threshold (negative LR of .70). The final sample (n = 224: 151 treatment and 73 control) identified for treatment and control conditions was associated with a minimal detectable effect size of approximately .40, assuming .80 power and 45 instructional groups with five students in each group. Simple random assignment of students to condition was completed using a random number generator in Statistical Analysis Software. The research design was a pretest-posttest control group design.
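As a rough plausibility check on the reported minimal detectable effect size, the sketch below applies the textbook two-group formula, MDES = (z for 1 - alpha/2 plus z for power) times the square root of (1/n_T + 1/n_C). This is a simplified approximation, not the authors' power analysis: it assumes a two-sided alpha of .05 (not stated in the article) and independent observations, so it ignores the clustering of students in instructional groups.

```python
from statistics import NormalDist


def mdes_two_group(n_treatment, n_comparison, alpha=0.05, power=0.80):
    """Approximate minimal detectable standardized effect for two groups.

    Assumptions: two-sided test, independent observations, and no
    adjustment for clustering or covariates.
    """
    z = NormalDist()  # standard normal distribution
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    standard_error = (1 / n_treatment + 1 / n_comparison) ** 0.5
    return multiplier * standard_error


mdes = mdes_two_group(151, 73)
print(round(mdes, 2))
```

With 151 treatment and 73 control students, the approximation comes out close to .40, in line with the value reported above.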

Setting and Demographics. Students in this study attended 10 elementary schools in a suburban central Texas community. As in any school district, the geographic location of the schools influenced the demographic characteristics of the student population. Our schools included diverse student populations; some schools had larger percentages of students who received free or reduced-price lunch, and these were the schools in which we typically had more students qualifying for the intervention and thus more intervention groups of students. The number of intervention groups ranged from 2 (three schools) to 6 (two schools). We had 3 groups in three schools, 4 groups in one school, and 5 groups in another school. We obtained demographic characteristics of the sample from the school district. For the treatment group, 50.4% of the students were classified as economically disadvantaged based on free or reduced-price lunch data. For the comparison group, 52.3% were considered economically disadvantaged. In the treatment group, 43.9% of the students were male and 56.1% were female; 26.6% were African American, 33.0% were Hispanic, 36.6% were White, and 3.6% were Asian/Pacific Islander. In the comparison group, 55.4% of the students were male and 44.6% were female; 21.5% were African American, 40.0% were Hispanic, 32.3% were White, and 6.2% were Asian/Pacific Islander.

Attrition. At the end of the school year, the sample included 204 first-grade students. Twenty-one students (12 treatment [13% of the treatment group] and 9 comparison [8% of the comparison group]) moved away from the district for various reasons during the academic year, leaving 204 students: 65 students in the comparison group and 139 students in the treatment group. The demographic percentages shown in Table 1 are for the postattrition sample.

Measures 

Screening and Progress Monitoring Measures. Our screening and progress-monitoring measure, the TEMI-PM, includes four group-administered subtests. Magnitude Comparisons (MCs) assesses a student’s ability to differentiate the smaller value of two numerals displayed side by side. For first-grade students, numerals range from zero through 99, with difficulty increasing as students move through items. On Number Sequences (NSs), respondents are presented with two numerals and a blank space indicating the missing third numeral (e.g., __ 18 19). The number of correctly identified missing numbers represents the raw score. Place Value (PV) uses a format similar to that used in many early math textbooks (e.g., Addison-Wesley Scott Foresman). Students are presented with figures depicting tens and ones (e.g., the number 34 is represented by three vertical stacks of 10 squares and four single squares) and asked to select their response from four options. Addition/Subtraction Combinations (ASCs) addresses young students’ knowledge of addition and subtraction facts from zero through 18. Items appear eight to a row, with five rows in all. Each row contains four addition problems and four subtraction problems. For all of the subtests, students have 2 min to write answers to as many items as possible. The number of correct responses represents the raw score. More complete descriptions of the measures, along with evidence of their reliability and validity, can be found in Bryant, Bryant, Gersten, Scammacca, and Chavez (2008) or Bryant, Bryant, Gersten, Scammacca, Funk et al. (2008).

Fall 2011

TABLE 1

Demographic Characteristics for the Treatment and Comparison Groups

Characteristic                 Treatment (n = 139)    Comparison (n = 65)
Ethnicity
  African American             26.6%                  21.5%
  Asian                        3.6%                   6.2%
  Native American              -                      -
  Hispanic                     33%                    40%
  White                        36.6%                  32.3%
Gender
  Male                         43.9%                  55.4%
  Female                       56.1%                  44.6%
ELL
  Yes                          5%                     9.2%
  No                           94.9%                  90.8%
Free/Reduced-Price Lunch
  Neither                      49.6%                  47.7%
  Free/reduced-price lunch     50.4%                  52.3%

Note. ELL = English language learner.

Outcome Measures. We administered the Stanford Achievement Test-Tenth Edition (SAT-10; Pearson, 2003) as one of the distal outcome measures to all students. The mathematics portion of the SAT-10 includes the Mathematics Problem Solving (MPS) and Mathematics Procedures (MP) subtests in Grade 1 with items that assess numeration, numerical sequencing, measurement, statistics, problem solving, and computation. A composite score (Total Mathematics) is also available. In the fall, students in Grade 1 are administered the Stanford Early School Achievement Test 2 level, and in the spring, students are administered the Primary-1 level. In the current study, the SAT-10 mathematics subtests yielded internal consistency reliability coefficients that exceeded .80. The total score internal consistency reliability coefficient exceeded .90.

The Texas Early Mathematics Inventories-Outcome (TEMI-O; University of Texas System & Texas Education Agency, 2007a) is a group-administered problem-solving and whole-number computation measure. The TEMI-O was considered to be a distal outcome measure because it assesses all of the state’s standards for first-grade instruction (our intervention focused only on the number and operation standards). The TEMI-O is composed of two subtests: Mathematics Problem Solving (MPS) and Mathematics Computation (MC). The MPS contains 39 items that assess number, operation, and quantitative reasoning; patterns, relationships, and algebraic thinking; geometry and spatial reasoning; measurement; probability and statistics; and underlying processes and mathematical tools. Teachers read aloud the stimulus item prompts, one at a time. The last response choice for each item is “Not Shown,” which increases the complexity of the item. The MC subtest contains 30 items that assess whole-number computation skills. Students are given 25 min to complete the items. The TEMI-O Total Score reliability, as estimated using coefficient alpha, was .86 for Form A, .90 for Form B, and .92 for Form C. Concurrent criterion-related validity was examined by correlating TEMI-O Total Scores with the Total Scores obtained by students on the SAT-10 and with estimates of student mathematics abilities, as rated by their teachers. The coefficient for the SAT-10 and Form A for the TEMI-O Total Score was .61; teacher ratings correlated with TEMI-O Total Scores at .61. Both coefficients reflect positively on the validity of the TEMI-O scores.
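For readers unfamiliar with coefficient alpha, the sketch below shows the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), applied to a small set of item responses. The response matrix is hypothetical and is not TEMI-O data; the function name is illustrative.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Coefficient (Cronbach's) alpha.

    item_scores: list of rows, one row per student and one column per item.
    """
    k = len(item_scores[0])                      # number of items
    item_columns = list(zip(*item_scores))       # transpose to per-item columns
    sum_item_var = sum(pvariance(col) for col in item_columns)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


# Hypothetical right (1) / wrong (0) responses: 5 students by 4 items.
hypothetical_responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(hypothetical_responses)
print(round(alpha, 2))
```

Alpha rises toward 1 as items covary more strongly with one another, which is why it is read as an index of internal consistency.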

We used the TEMI-PM as the proximal fluency progress-monitoring measure to answer research Question 1 because of the assessment’s alignment with the interventions and the first-grade curriculum in the school district. To answer research Question 2, we used the SAT-10 and the TEMI-O as distal outcome measures because they assess mathematical problem solving, which we did not directly teach, and mixed whole-number computation. Although we taught whole-number computation, we did not assess these skills in a mixed format.


PROCEDURES 

Assessment Procedures 

Screening and Benchmark Testing. The TEMI-PM (screening) and TEMI-O were administered in the fall (September), winter (February), and spring (May) by 50 first-grade classroom teachers to intact classrooms of students who returned signed, affirmative permission slips, in line with Institutional Review Board (IRB) procedures. Testing occurred over 3 consecutive days; each session lasted approximately 45 min. We used the TEMI-PM as the initial screening measure to identify students who scored below the 35th percentile. To this pool of students (N = 238), the project staff administered four additional alternate forms of the TEMI-PM probes to continue the identification process. The project staff administered the SAT-10 to intact classes in May. The test was administered across 2 days, with the Mathematics Procedures subtest given the first day and the Mathematics Problem Solving subtest given the next day.

Training. For the research team, the project and assessment coordinators conducted a half-day training session on all measures. Administration procedures for each of the measures were presented and modeled. The research team had time to practice the administration procedures under the direction of the project coordinators. The research team included two full-time intervention coordinators and five graduate research assistants (GRAs) who were doctoral and master’s students in the Department of Special Education; all of the GRAs held teaching credentials or were completing a teaching certification program. This research team was also responsible for conducting the intervention. The purpose of this training was to ensure that the staff was prepared to train the classroom teachers and to conduct observations of assessment fidelity.

For the 50 first-grade classroom teachers, the two intervention coordinators provided a 1-hr training. Teachers were provided with materials for conducting the assessments and with “prompt” materials (e.g., tips for administration) to ensure fidelity of test administration. Training sessions occurred at the beginning of the academic year, with 1-hr refresher trainings conducted after school or during preparatory periods before test administration in the winter and spring.

Fidelity of Assessment Administration. First-grade classroom teachers administered the TEMI-PM and TEMI-O assessments over 3 days in the fall, winter, and spring of the academic year. For each of the 3 days of testing, the research team conducted fidelity checks by randomly choosing 10 of the 50 first-grade teachers (total n = 30) for observations. Interrater agreement results for teacher fidelity were 91.7% in the fall, 97.2% in the winter, and 97.2% in the spring.
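Interrater agreement of this kind is commonly summarized as the percentage of checklist items on which two observers record the same rating. A minimal sketch of that calculation follows; the article does not specify how agreement was computed, and the checklist length and ratings shown are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters recorded the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same number of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


# Hypothetical fidelity checklist: 1 = step observed, 0 = step not observed.
observer_1 = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
observer_2 = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]

agreement = percent_agreement(observer_1, observer_2)
print(f"{agreement:.1f}%")
```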

Early Numeracy Intervention: Treatment Procedures

Intervention Training. At the beginning of the academic year, the principal investigator provided a 3-hr training on the intervention lessons and accompanying instructional materials. This training consisted of an explanation of the content and review and modeling of systematic instruction. Following this training, the research team practiced the lessons with one another. Before intervention, the tutors taught a lesson and received feedback from experienced tutors who were using the same lessons with a group of students. Throughout the school year, training sessions were conducted before each intervention unit (seven total sessions).

Description of the Intervention. The early numeracy intervention program focused on number and operation mathematical ideas, including problem solving, that were drawn from prominent sources on mathematics instruction (e.g., Clements & Sarama, 2009; Curriculum Focal Points for Prekindergarten Through Grade 8 Mathematics, National Council of Teachers of Mathematics, NCTM, 2006; NMAP, 2008; National Research Council, NRC, 2009). Our goal was to help young students engage in activities to promote conceptual, strategic, and procedural knowledge development for number and operation concepts and skills. We included activities that related to counting (e.g., counting sequence, counting principles) and to number knowledge and relationships (e.g., comparing the magnitude of numbers and quantity and ordering or sequencing numbers). We also included activities that focused on partitioning and grouping of tens and units (e.g., part-whole, compose and decompose numbers), which prepare students for work in place value and the base-10 system in later school years. Finally, early numeracy instruction should also include activities to help students develop a conceptual understanding of addition and subtraction and the mathematical properties that can be used to solve arithmetic combinations. To that end, we provided numerous opportunities for students to learn about combining and separating sets and working with basic facts (e.g., part-part-whole; fact families; related facts). See Table 2 for more information about the mathematical ideas taught in the intervention.

There were 11 units of instruction; each unit included 8 days of lessons. Each instructional day included a warm-up and two scripted lessons. The warm-up was 3 min and consisted of fluency activities on previously taught skills (e.g., reading and writing numerals within a certain range, practicing addition and subtraction facts). Each of the two daily lessons was 10 min in length. Time was allowed to transition between lessons after independent practice for the lesson. The lessons focused on developing conceptual knowledge through teachers “thinking aloud” to demonstrate how to solve problems and through the teacher and students using concrete (e.g., base-ten models, connecting cubes) and visual representations (e.g., number lines, ten frames, hundreds charts, fact cards) to model problems and show relationships (Gersten, Chard et al., 2009). Students also learned specific cognitive strategies (e.g., count on, doubles + 1, make 10 + more, fact families) as a way to solve different types of problems more efficiently (Clements & Sarama, 2009; Woodward, 2006).

The instructional design of the lessons included the critical features of systematic intervention that have been validated in numerous studies with struggling students (e.g., Swanson, Hoskyn, & Lee, 1999). The features included a teaching routine consisting of modeling, guided practice, and independent practice (progress monitoring); error correction procedures; pacing; opportunities for meaningful practice (e.g., with visual representations); examples; and review. Daily progress monitoring was conducted: students were given a short amount of time to work independently to solve problems that were the focus of instruction. The numbers of correct and incorrect responses were entered into a daily check-up sheet and examined to determine student progress. At the conclusion of each unit, a unit check was conducted on representative items across the lessons taught in the unit. These data were graphed for examination of student response to the intervention.

Behavior Management System. The behavior management system was an interdependent group-oriented contingency system (Litow & Pumroy, 1975); that is, all of the students of the tutoring group had to meet the criterion (i.e., be “Math Ready”) of the contingency before earning reinforcement (Cooper, Heron, & Heward, 2007). Math Ready consisted of five behaviors: eyes on the teacher, listening, ready to learn, mouth quiet (no off-task talking), and hands on table. The expected behaviors were taught to students at the beginning of the year and reviewed after breaks (e.g., winter break). Students were expected to demonstrate these behaviors at the beginning of the day’s lessons and during the lessons, as appropriate (e.g., “hands on table” was not expected while students were engaged in activities). The math tutor used 5 to 10 marbles to intermittently reinforce the group when they were exhibiting Math Ready behaviors during the day’s lessons. If the students earned all of the marbles, they were rewarded with stickers and small items such as pencils or pencil erasers. Students were reminded to be Math Ready at the beginning of each day’s lessons and during transitions (e.g., between lessons, during materials distribution).

TABLE 2

Early Numeracy Curriculum

• Counting: rote, counting up/back

• Number recognition and writing: 0-99

• Comparing and grouping numbers

• Number relationships of more, less

• Relationships of 1 and 2 more than/less than

• Part-part-whole relationships (e.g., ways to represent numbers)

• Numeric sequencing (ordering)

• Making and counting: groups of tens and ones

• Using base-ten (2 tens, 6 ones) and standard language (26) to describe place value

• Reading and writing numbers to represent base-ten models

• Counting and decomposition strategies (e.g., addition: count on [+1, +2, +3], doubles [6 + 6], doubles + 1 [6 + 5], make 10 + more [9 + 5]; subtraction: count back/down [-1, -2, -3]); fact families

• Properties of addition (commutativity and associativity)

Tutoring Program. Tutoring sessions occurred 4 days per week for 25 min per session across 19 weeks (total of 1,900 min and 76 sessions). Small groups of three to five students at each of the 10 schools were formed based on the beginning-of-the-year scores from the TEMI-PM assessment and teachers’ schedules. Students who qualified for the treatment condition were pulled from one to six classes per school. Thus, tutoring groups were formed based as much as possible on assessment results but also on classroom teachers’ schedules. However, due to scheduling issues in individual classrooms, some groups had to change to accommodate teachers’ requests (these changes occurred two to four times during the year, depending on the school). Other changes in groups occurred during the year due to behavior concerns and ability levels (these changes occurred two to five times). A trained tutor from the research team delivered the intervention in whatever setting each school could find for daily small-group intervention. Thus, tutoring occurred in a classroom, in a library, on a stage, and in the book room.

Fidelity of Treatment Implementation and Nature of Comparison Services

Fidelity of Implementation. Each tutor was observed for three sessions during the 19-week intervention to assess the quality (i.e., fidelity) of specific implementation performance indicators. Quality of Implementation (QoI) indicators included the degree to which tutors did the following:

• Followed the scripted lessons (e.g., modeling, guided practice, independent practice).

• Implemented the features of explicit, systematic instruction (e.g., pacing, error correction).

• Managed student behavior (e.g., use of reinforcers and redirection).

• Managed the lesson (e.g., use of timer, smooth transitions between booster lessons).

Performance indicators were rated on a 0-to-3-point scale, in which 0 = Not at All, 1 = Rarely, 2 = Some of the Time, and 3 = Most of the Time. Results were shared with the tutors, and areas in need of further training and recommendations for improved performance were discussed. Results on the QoI showed average ratings exceeding 2.5 in all areas, with no single rating of < 2.0. The majority of ratings were 3.0. These results across tutors show that there was a high degree of fidelity in the implementation of the booster lessons.
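The fidelity pattern described above (area means above 2.5, no single rating below 2.0) can be checked mechanically once ratings are tabulated. The sketch below shows one way to do so; the ratings are hypothetical and only illustrate the 0-to-3 scale, and the area labels paraphrase the four indicators listed above.

```python
from statistics import mean

# Hypothetical ratings on the 0-to-3 scale, one list per indicator area.
qoi_ratings = {
    "followed scripted lessons": [3, 3, 3, 2, 3, 3],
    "explicit, systematic instruction": [3, 3, 2, 3, 3, 3],
    "managed student behavior": [3, 2, 3, 3, 3, 3],
    "managed the lesson": [3, 3, 3, 3, 2, 3],
}

# For each area, record the mean rating and the lowest single rating.
summary = {area: (mean(r), min(r)) for area, r in qoi_ratings.items()}

# True when every area mean exceeds 2.5 and no single rating falls below 2.
meets_pattern = all(avg > 2.5 and low >= 2 for avg, low in summary.values())

for area, (avg, low) in summary.items():
    print(f"{area}: mean = {avg:.2f}, lowest = {low}")
print("Matches reported pattern:", meets_pattern)
```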

Observations of Teachers of the Comparison Condition. A research consultant for the project conducted the observations of the general education teachers. Our consultant was trained on the intervention and was highly skilled in behavior management. One first-grade teacher at each of the nine campuses was randomly selected for the classroom observation of “business as usual” (BAU) intervention for the comparison groups. Teachers at the 10th school chose not to participate in this aspect of the study. Observations lasted from 30 min to 1 hr, depending on the length of each teacher’s lesson.

We used an observation rating scale for data-collection purposes, ranging from 3 for Most of the Time to zero for Not at All. We specifically chose items for data collection that aligned with items chosen for the fidelity observation of the treatment tutors so that we could compare results on similar indicators across the conditions. The scale included sections on teacher behavior for intervention, instruction, progress monitoring, student behavior management, and lesson management. Results on the QoI showed the following mean scores: (a) overall math instruction: 1.56, (b) teaching the lesson: 2.33, (c) implementing instructional procedures: 2.51, (d) monitoring student progress: 2.43, (e) managing student behavior: 2.09, and (f) managing the lesson: 2.78. Thus, comparatively speaking, tutors in the treatment condition demonstrated higher ratings than teachers in the comparison condition on indicators of instruction and management that are crucial for intervention work.

Anecdotally, the BAU did not contain any
well-defined treatment for Tier 2 students.
Rather, the research consultant noted a variety of
groupings and instructional materials (e.g.,
manipulatives, worksheets). For example, it was
noted that most of the teachers used small-group
instruction to work with the comparison
students. Group size varied from pairs of students, to
small groups of three to five students, to larger
groups of seven or more. No explicit, systematic
mathematics instruction was observed with the
struggling students; rather, the teachers focused
on completing the whole-class assignment in a
smaller group, through centers, or by reviewing
for upcoming assessments. One teacher provided
packets of work that were differentiated, based on
students’ academic levels, and another teacher
paired higher-performing students with
lower-performing students. Instructional pacing varied
across teachers; students sometimes appeared
disengaged in the smaller groups when the pacing
was slower. Some students in the classroom did
not behave appropriately while the teacher
worked with the small groups of students. The
research consultant noted some high-quality
whole-class core instruction and some use of
small-group instruction.

In examining the intervention approaches
conducted by the tutors in the treatment condition
and the teachers in the comparison condition,
some obvious differences between the
conditions could be viewed as “value added” for
the intervention condition. First, our intervention
consistently included structured lessons with
systematic instruction, scripted lessons to promote
pacing, and carefully sequenced lessons. We
employed a concrete-semi-concrete-abstract procedure
to help build conceptual knowledge,
including the use of visual representations, which
are supported in the literature as important
components of instruction (Gersten, Beckmann et al.,
2009). Our intervention provided multiple practice
opportunities, which are sorely lacking in the
general education classroom and are another critical
factor of systematic instruction. During the
observations, we did not see evidence of systematic
instruction, which is well documented as an
essential component of instruction for struggling
students. Also, we conducted progress monitoring
(independent practice) as part of every lesson.
Although the general education teachers did,
in some cases, incorporate checking for
understanding into instruction, our progress monitoring
was systematic, including self-correcting (error
correction) by students.

RESULTS 

All variables were within normal limits based on a
review of normal probability plots by Chambers,
Cleveland, Kleiner, and Tukey (1983). Table 3
shows descriptive statistics for fall and spring
results on the TEMI-PM and the TEMI-O. No
significant differences were found between groups in
the fall on the TEMI-PM and TEMI-O. Table 4
shows means and standard deviations for the two
groups for the SAT-10, which was administered
only in the spring. Estimates of clustering due to
school or tutor suggested minimal effects.
Accordingly, the data were analyzed as a
single-level model.

TABLE 4

Means and Standard Deviations for Spring Scores on the SAT-10

                Comparison Group            Treatment Group
Measure         N      M        SD          N      M        SD
SAT-10 MPS      64     88.75    11.16       139    89.61    11.58
SAT-10 MP       64     92.56    13.59       139    95.70    13.24
SAT-10 TS       64     89.84    11.75       139    91.61    11.95

Note. SAT-10 = Stanford Achievement Test-Tenth Edition; MPS = Mathematics Problem Solving subtest;
MP = Mathematics Procedures subtest; TS = Total Score.

A series of analyses of covariance (ANCOVA)
with the TEMI-PM fall Total Score as the covariate
was used to evaluate statistical differences
between groups and to maximize power of the
design. The increased Type I error rate associated
with multiple comparisons was addressed using
the Benjamini-Hochberg (1995) correction,
which controls for false discovery rate.
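
To make the covariate adjustment concrete, the sketch below computes covariate-adjusted posttest means of the kind an ANCOVA reports, using the pooled within-group regression slope. It is a minimal illustration rather than the authors' analysis code: the function name `adjusted_means` and the toy scores are hypothetical, and a full ANCOVA would also produce the F test.

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariate-adjusted posttest means for each group.

    `groups` maps a group label to a list of (pretest, posttest) pairs.
    Adjusted mean = posttest mean - b_w * (group pretest mean - grand pretest mean),
    where b_w is the pooled within-group regression slope.
    """
    sxy = 0.0  # pooled within-group sum of cross-products
    sxx = 0.0  # pooled within-group sum of squares of the covariate
    all_pre = []
    summary = {}
    for label, pairs in groups.items():
        pre = [x for x, _ in pairs]
        post = [y for _, y in pairs]
        pre_mean, post_mean = mean(pre), mean(post)
        sxy += sum((x - pre_mean) * (y - post_mean) for x, y in pairs)
        sxx += sum((x - pre_mean) ** 2 for x in pre)
        all_pre.extend(pre)
        summary[label] = (pre_mean, post_mean)
    slope = sxy / sxx
    grand_pre_mean = mean(all_pre)
    return {
        label: post_mean - slope * (pre_mean - grand_pre_mean)
        for label, (pre_mean, post_mean) in summary.items()
    }


# Hypothetical toy scores (pretest, posttest); not data from the study.
toy = {
    "comparison": [(1, 2), (2, 3), (3, 4)],
    "treatment": [(2, 5), (3, 6), (4, 7)],
}
result = adjusted_means(toy)
```

In the toy data the treatment group starts with a higher pretest mean, so its raw posttest advantage is reduced once both groups are evaluated at the grand pretest mean.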

Two procedures were conducted, one to evaluate
statistical significance of the eight noncomposite
scores (i.e., TEMI-PM: MC, NS, PV, and
ASC; SAT-10: MPS and MP; TEMI-O: computation
and problem solving) and the other to evaluate
group difference on the composite scores of
the TEMI-PM Total Score, TEMI-O Total Score,
and SAT-10 Total Score (because composite scores
are the sum of two or more noncomposite measures,
the procedures were separated to maintain
independence of observations).
Benjamini-Hochberg does not produce a new p-value.
Instead, it indicates whether a given finding is
significant at the specified level after correcting
for multiple comparisons according to $p_i^{*} = i\alpha/M$,
where $i$ is the rank of $p_i$, the original p-value, $M$ is
the total number of findings within the domain,
and $\alpha$ is the target p-value.

Assumptions regarding homogeneity of regression
were evaluated for all outcomes. There
were no violations. We calculated Hedges’ g (g*)
for small sample sizes. Differences in adjusted
posttest means were standardized using the
pooled within-groups standard deviation (Hedges
& Olkin, 1985).
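
For reference, one common form of the small-sample-corrected standardized mean difference described by Hedges and Olkin (1985) is shown below. The article does not print the formula, so the notation here is an assumption (subscripts T and C for the treatment and comparison groups, and an "adj" superscript for adjusted posttest means), and the authors' exact computation may differ in detail.

```latex
g^{*} = \left(1 - \frac{3}{4(n_T + n_C - 2) - 1}\right)
        \frac{\bar{Y}^{\mathrm{adj}}_{T} - \bar{Y}^{\mathrm{adj}}_{C}}{S_{\mathrm{pooled}}},
\qquad
S_{\mathrm{pooled}} = \sqrt{\frac{(n_T - 1)\,s_T^{2} + (n_C - 1)\,s_C^{2}}{n_T + n_C - 2}}
```

With samples of this size the correction factor in parentheses is very close to 1, so g* differs only slightly from the uncorrected standardized difference.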

Results for research Question 1 showed
statistically significant differences (adjusted for Type
I error based on Benjamini-Hochberg with M = 8
and α = .05) in favor of the treatment group on
the Addition and Subtraction Combinations
(p < .0001; g* = .55), Place Value (p < .002; g* =
.39), Number Sequences (p < .00001; g* = .47),
and the TEMI-PM Total Score (p < .01; g* = .50).
No differences were found on the Magnitude
Comparisons subtest (p = .16; g* = .18).

On research Question 2, there were
statistically significant differences on TEMI-O
Computation (p = .001; g* = .44) and on SAT-10
Mathematics Procedures (p = .05; g* = .23),
though this latter difference was not statistically
significant after Benjamini-Hochberg adjustment
for Type I error. There were no statistically
significant differences on TEMI-O Problem Solving
(p = .99; g* = -.05) or on the SAT-10 Mathematics
Problem Solving subtest (p = .32; g* = .07).
Groups differed on the TEMI-O Total Score at p
= .05 (g* = .21); however, this difference did not
meet the requirements for significance after
controlling for Type I error (M = 3, α = .05).
Differences on the SAT-10 Total Score (p = .14; g* =
.15) were not statistically significant (see Table 5).
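
The Benjamini-Hochberg decision rule described above can be sketched in a few lines. This is an illustrative implementation of the standard step-up procedure, not the authors' code; the function name `benjamini_hochberg` is hypothetical, and p-values that the text reports only as "p < x" are entered at their upper bound x, which is an assumption made purely for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return {label: bool} marking findings significant under the
    Benjamini-Hochberg step-up procedure at false discovery rate `alpha`."""
    m = len(p_values)
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    # Find the largest rank i whose p-value is at or below i * alpha / M.
    cutoff_rank = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank * alpha / m:
            cutoff_rank = rank
    # Every finding ranked at or below the cutoff is declared significant.
    return {label: rank <= cutoff_rank
            for rank, (label, _) in enumerate(ranked, start=1)}


# p-values as reported in the text; "p < x" values are entered as x
# (an assumption for illustration only).
noncomposite = {
    "TEMI-PM ASC": 0.0001, "TEMI-PM PV": 0.002, "TEMI-PM NS": 0.00001,
    "TEMI-PM MC": 0.16, "TEMI-O Computation": 0.001,
    "TEMI-O Problem Solving": 0.99, "SAT-10 MP": 0.05, "SAT-10 MPS": 0.32,
}
composite = {"TEMI-PM Total": 0.01, "TEMI-O Total": 0.05, "SAT-10 Total": 0.14}

noncomposite_result = benjamini_hochberg(noncomposite)  # M = 8
composite_result = benjamini_hochberg(composite)        # M = 3
```

Under these assumptions the sketch flags the same findings the text reports as significant, while SAT-10 Mathematics Procedures and the TEMI-O Total Score, both at p = .05, fall short of their rank-based thresholds.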

DISCUSSION 

This experimental study sought to determine
whether an intervention provided in first grade to
students demonstrating overall low early numeracy
and computation performance would be associated
with improved outcomes, compared with
students randomized to a comparison condition.
We hypothesized that students in the treatment
condition would outperform comparison
students on the TEMI-PM Total Score (proximal
measure). Findings revealed that students in the
treatment condition outperformed comparison
students by .5 of a standard deviation and
demonstrated statistically significantly higher
scores than comparison students on the TEMI-PM
Total Score and three of the four subtests
(there were no differences between groups on the
Magnitude Comparisons subtest). Thus, these
findings confirm our hypothesis. This is
educationally significant and clinically meaningful per
the guidelines provided by the Institute of
Education Sciences What Works Clearinghouse
(http://ies.ed.gov/ncee/wwc/pdf/wwc_version1_standards.pdf).

TABLE 5

Posttest Results by Outcome Measure for the Comparison and Treatment Groups

                                            Adjusted Posttest Means
                                            Comparison   Treatment
                                            Group        Group                 Adjusted   Hedges' g
Measure                                     (n = 64)     (n = 139)    F        Sig.       (g*)
SAT-10 Mathematics Procedures               92.19        95.87        3.65     .05        .23
SAT-10 Mathematics Problem Solving          88.27        89.83        1.01     .32        .07
SAT-10 Total Score                          89.39        91.82        2.19     .14        .15
TEMI-O Mathematics Computation              16.46        19.15        12.11    < .01      .44
TEMI-O Mathematics Problem Solving          26.87        26.85        0.00     .99        -.05
TEMI-O Total Score                          43.35        46.10        3.82     .05        .21
TEMI-PM Magnitude Comparisons               32.99        34.62        2.02     .16        .18
TEMI-PM Number Sequences                    15.70        18.98        13.78    < .01      .47
TEMI-PM Place Value                         15.62        17.88        9.72     < .01      .39
TEMI-PM Addition Subtraction Combinations   13.68        17.58        16.34    < .01      .55
TEMI-PM Total Score                         78.00        89.1         14.94    < .01      .50

Note. SAT-10 = Stanford Achievement Test-Tenth Edition; TEMI-O = Texas Early Mathematics Inventories-Outcome;
TEMI-PM = Texas Early Mathematics Inventories-Progress Monitoring; partial eta-squared is an effect-size estimator
based on the proportion of total variation attributable to the factor, excluding other factors from the total nonerror
variation; Hedges’ g is a standardized mean difference estimator with the variance estimate corrected for bias.

On closer scrutiny of the TEMI-PM subtest
scores, we found significant effects for PV in favor
of the treatment group. The findings are
encouraging with respect to the effects of intervention
activities designed to teach relationships of tens
and ones, particularly because no significant
effects were detected on the PV subtest results in an
earlier study (Bryant, Bryant, Gersten,
Scammacca, Funk et al., 2008).

Additionally, we were interested in examining
how students in the treatment condition
performed on arithmetic combinations (i.e., basic
facts) because difficulty with automatic retrieval of
arithmetic combinations has been identified as a
hallmark of mathematics difficulties (Bryant, Bryant, &
Hammill, 2000; Bryant, Bryant, Williams, Kim,
& Shin, in press; Geary, 2004; Gersten, Jordan,
& Flojo, 2005; Siegler, 2007). Fluency development
was incorporated into daily practice and
warm-up activities. The positive effects (g* = .55)
for the ASC subtest of the TEMI-PM suggest that,
relative to the comparison condition, the activities
proved beneficial.

Finally, we were disappointed by the results
on the Magnitude Comparisons subtest of the
TEMI-PM, which is an area that warrants closer
examination. Compatible number pairs (e.g., 32
and 46: the digit in the ones place in the smaller
numeral is less than the digit in the ones place in
the larger numeral [2 < 6 and 32 < 46]) and
incompatible number pairs (e.g., 63 and 57: the
digit in the ones place in the smaller numeral is
greater than the digit in the ones place in the
larger numeral [7 > 3 but 57 < 63]) could have
been a contributing factor to slowing response
time if students were not paying attention to the
value of each digit. For example, Nuerk,
Kaufmann, Zoppoth, and Willmes (2004) and Nuerk,
Weger, and Willmes (2001) hypothesized that
students with mathematics difficulties may
exhibit slow response rates when examining
decade-unit incompatibility to discriminate quantities.
Their research on the compatibility effect
included students only as young as second grade;
thus, we do not know how this unit-decade
compatibility effect is manifested in younger students.
Also, the subtest contained pairs of numbers close
to each other on the number line (e.g., 34 and
38) and pairs of numbers farther apart (e.g., 22
and 68). The ability to more accurately and
quickly discriminate quantitative differences
between two numerals with larger distances between
them is called the distance effect (Dehaene,
Dupoux, & Mehler, 1990; Nuerk et al., 2004).
Conceivably, students with mathematics
difficulties may have more problems discriminating
quantities that are close to each other on the
number line.
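
The compatibility and distance definitions above can be expressed as a small classifier. This is a sketch for illustration only: the function name `classify_pair` is hypothetical, and labeling pairs with equal ones digits as "neutral" is an assumption, because the text does not say how such pairs are treated.

```python
def classify_pair(a, b):
    """Classify a pair of two-digit numbers by unit-decade compatibility
    and report the numerical distance between them."""
    if a == b:
        raise ValueError("the two numbers must differ")
    smaller, larger = sorted((a, b))
    units_smaller, units_larger = smaller % 10, larger % 10
    if units_smaller < units_larger:
        kind = "compatible"      # ones digits point the same way as the numbers
    elif units_smaller > units_larger:
        kind = "incompatible"    # ones digits point the opposite way
    else:
        kind = "neutral"         # assumption: equal ones digits
    return kind, larger - smaller


# The example pairs mentioned in the text.
examples = {pair: classify_pair(*pair)
            for pair in [(32, 46), (63, 57), (34, 38), (22, 68)]}
```

For the pairs in the text, 32 and 46 come out compatible, 63 and 57 incompatible, and the distances for 34 and 38 versus 22 and 68 are 4 and 46, which is the contrast the distance effect refers to.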

We also hypothesized that students in the
treatment condition would outperform comparison
students on the SAT-10 Mathematics Procedures
subtest and the TEMI-O Mathematics
Computation subtest because these subtests were
more closely aligned with our basic facts and
mixed whole-number computation lessons.
Although findings for the SAT-10 Mathematics
Procedures were not significant when adjusting for
Type I error, the effect size was g* = .23 and the
p value from the ANCOVA was .05. Also, the
TEMI-O Mathematics Computation subtest had
a treatment effect of g* = .44. These findings are
educationally significant, and our hypothesis was
confirmed.

Our findings are similar to those of other
studies (Fuchs et al., 2006; Fuchs, Fuchs, &
Hollenbeck, 2007) that demonstrated significant
findings for a preventative first-grade tutoring
program with a strong number, operation, and
quantitative reasoning component. Compared to
our previous studies, we would argue that the
increased length of the tutoring sessions (daily and
total time); the features of carefully constructed
problems with multiple, visual representations;
and the purposeful and meaningful practice (e.g.,
review) contributed to the overall effects found in
this study on the TEMI-PM, the SAT-10
Mathematics Procedures, and the TEMI-O
Computation subtests.

We also examined whether treatment
students would outperform comparison students on
the distal measure of mathematics problem
solving on the SAT-10 and the TEMI-O. There were
no statistical differences between groups on either
of these measures on problem solving, as
predicted. We did not anticipate between-group
differences because our curriculum did not directly
teach problem solving in the manner in which it
was measured on either subtest. Thus, our
hypothesis about no significant differences between
groups was also confirmed.

Next, we examined the findings from yet
another perspective. In addition to statistical and
practical effects, we were interested in measuring
clinical effects (Thompson, 2002). For the
purposes of this study, we defined clinical effects as
the percentage of students who moved out of the
risk category, based on their end-of-year
mathematics scores. By the end of first grade, 45% of
treatment students and 22% of comparison
students were no longer at risk for mathematics
difficulties, as determined by the results on the spring
TEMI-PM. We were pleased with the percentage
of students who were eligible to exit Tier 2
intervention and the apparent effect of the
intervention in reducing the percentage of students with
mathematics difficulties. The risk status of these
students in the fall of the following year remains
to be determined. It is important to determine
whether the effects of the preventative first-grade
tutoring for the “responders” to the intervention
are maintained in subsequent years, as the
demands of the mathematics curriculum increase
(Fuchs et al., 2005). Additionally, it is important
to identify how the remaining 55% of the Tier 2
students with at-risk status at the conclusion of
the academic year fared the following year.

Finally, we thought it was important to
examine campus-level factors that affect
intervention research. We anecdotally examined
whether a preventative mathematics intervention
would be feasible to implement within the
real-world context of schooling and whether schools
and teachers would accommodate the time for the
intervention from their daily instructional
schedule. In each of our 10 schools, we formed
tutoring groups that included students from different
first-grade classes. We needed to work closely with
teachers to identify mutually agreed-on times
when we could pull students from their classes,
which was somewhat challenging. We found in
our next study, however, that working with
central office administration, the principals, and the
teachers in the spring of the year before the
implementation of tutoring was a reasonable
solution to the scheduling challenges.

In sum, the findings indicate that students
who participated in the intervention, compared to
students from the same classes and schools who
did not participate, performed statistically
significantly better on the progress monitoring measure
(i.e., TEMI-PM) closely aligned with the
intervention and the progress monitoring distal
measure (i.e., Mathematics Computation of the
TEMI-O), with less robust findings for the
SAT-10. Moreover, the percentage of treatment
students compared to comparison students who were
no longer eligible for Tier 2 intervention suggests
that interventions can potentially reduce the
number of students at risk for mathematics
difficulties by the end of first grade.

Limitations 

The implementation of the Tier 2 intervention
program by our research staff on a pullout basis
was a limitation of this study. When the goal is to
validate an intervention, trained research staff
must be responsible for implementation. Studies
are needed, however, in which general education
teachers and interventionists conduct the
intervention to determine the practicality of the
program and the effect on students’ mathematics
performance. Scaling up research to classroom
teachers and interventionists to provide Tier 2
intervention must be conducted and replicated to
help us learn what makes sense for classroom
implementation.

Future Research 

Future research studies are warranted in several
areas. First, most multitiered models are based on
the premise that students in Tier 2 intervention
have participated in a rigorous, research-based
Tier 1 program and that these students are at risk
for reasons other than poor classroom instruction.
Studies are needed to document the nature and
effects of Tier 1 mathematics instruction for
young students. Certainly future research that
examines the effectiveness of Tier 2 interventions
within the context of robust Tier 1 instruction is
needed.

Second, studies are needed to further examine
the unit-decade compatibility and distance
effects with younger students to determine the
developmental nature of these numerical
representations. It is conceivable that more
instructional attention needs to be provided to those
students who have slow response rates in
discerning and understanding differences in quantities.

Third, longitudinal studies are warranted to
examine the mathematics performance of
students who previously received Tier 2 intervention
in first grade. It is important to follow students
who exited from Tier 2 in first grade, students
who remained in Tier 2, and students who
qualified for Tier 3 in second and subsequent grades to
determine the effects of intervention and whether
mathematics difficulties continue.

Educational Implications 

In the absence of widespread, evidence-based Tier
2 mathematics interventions for young struggling
students, we think that schools can begin to take
steps to provide services to those whose needs
require immediate help. First-grade teachers should
conduct systematic progress monitoring on
essential mathematical ideas that they are responsible
for teaching. Information about
progress-monitoring tools can be found, for example, at the
National Center on Response to Intervention’s
website (www.rti4success.org/). Also, according to the
findings from this study and others (e.g., Fuchs et
al., 2005), small-group instruction is a necessary
component of early mathematics intervention.
General education teachers or mathematics
interventionists could conduct the intervention with
supported coaching, as needed. We found in
other studies (e.g., Bryant, Roberts, & Bryant,
2010) that general education teachers, for the
most part, value support as they try to implement
Tier 2 interventions that are new for them.

Finally, our Tier 2 first-grade mathematics
intervention involved an increased amount of
instructional time, compared to our earlier studies;
mathematical models (e.g., visual
representations); activities to support student engagement;
and systematic instruction to develop conceptual
knowledge and procedural fluency and
automaticity. Overall, findings from this study
support the use of these intervention procedures to
help young, at-risk students improve their
mathematical performance.